Machine-learning paradigms for selecting ecologically significant input variables
Abstract
Harmful algal blooms, which are considered a serious environmental problem nowadays, occur in coastal waters in many parts of the world. They cause acute ecological damage and ensuing economic losses, due to fish kills and shellfish poisoning as well as public health threats posed by toxic blooms. Recently, data-driven models including machine learning (ML) techniques have been employed to mimic the dynamics of algal blooms. One of the most important steps in the application of an ML technique is the selection of significant model input variables. In the present paper, we use two extensively used ML techniques, artificial neural networks (ANN) and genetic programming (GP), for selecting the significant input variables. The efficacy of these techniques is first demonstrated on a test problem with known dependence, and they are then applied to a real-world case study of water quality data from Tolo Harbour, Hong Kong. These ML techniques overcome some of the limitations of the currently used techniques for input variable selection, a review of which is also presented. The interpretation of the weights of the trained ANN and of the GP-evolved equations demonstrates their ability to identify the ecologically significant variables precisely. The significant variables suggested by the ML techniques also indicate chlorophyll-a itself to be the most significant input in predicting the algal blooms, suggesting an auto-regressive nature or persistence in the algal bloom dynamics, which may be related to the long flushing time in the semi-enclosed coastal waters. The study also confirms the previous understanding that the algal blooms in coastal waters of Hong Kong often occur with a life cycle of the order of 1-2 weeks.
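As a rough illustration of reading input significance off trained ANN weights on a test problem with known dependence, the sketch below trains a small one-hidden-layer network in NumPy and ranks inputs by the summed magnitude of their input-to-hidden times hidden-to-output weight products. Everything here is an assumption made for illustration: the synthetic data (only input 0 drives the target), the network size, the training settings, and the `weight_importance` measure are hypothetical and are not taken from the paper, which may interpret the weights differently.

```python
import numpy as np


def train_mlp(X, y, n_hidden=6, lr=0.05, epochs=2000, decay=1e-2, seed=0):
    """Train a one-hidden-layer tanh network with full-batch gradient descent.

    Hypothetical settings chosen for this sketch; loss is 0.5 * mean squared
    error plus a small L2 penalty (weight decay) on W1 and W2.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, n_hidden))  # input -> hidden weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=n_hidden)       # hidden -> output weights
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden activations, (n, n_hidden)
        err = H @ W2 + b2 - y                      # prediction error, (n,)
        grad_W2 = H.T @ err / n
        grad_b2 = err.mean()
        dZ = np.outer(err, W2) * (1.0 - H ** 2)    # backprop through tanh
        grad_W1 = X.T @ dZ / n
        grad_b1 = dZ.mean(axis=0)
        W1 -= lr * (grad_W1 + decay * W1)
        b1 -= lr * grad_b1
        W2 -= lr * (grad_W2 + decay * W2)
        b2 -= lr * grad_b2
    return W1, W2


def weight_importance(W1, W2):
    """Relative importance per input: sum over hidden units of |W1[i, j] * W2[j]|."""
    raw = np.abs(W1 * W2).sum(axis=1)
    return raw / raw.sum()


# Synthetic test problem with known dependence: only input 0 matters.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=500)

W1, W2 = train_mlp(X, y)
importance = weight_importance(W1, W2)
print(np.round(importance, 3))
```

On real water quality data the inputs would typically be standardized first, and lagged values of chlorophyll-a could be included as candidate inputs so that any persistence shows up in the ranking.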
Similar resources
Machine learning algorithms for time series in financial markets
This research is related to the usefulness of different machine learning methods in forecasting time series on financial markets. The main issue in this field is that economic managers and scientific society are still longing for more accurate forecasting algorithms. Fulfilling this request leads to an increase in forecasting quality and, therefore, more profitability and efficiency. In this pa...
Renyi's-entropy-based Approach for Selecting the Significant Input Variables for the Ecological data
Recently, data-driven approaches including machine-learning (ML) techniques have played a key role in the research on ecological data and models. One of the most important steps in the application of an ML technique is the selection of significant model input variables. Among ML methods, artificial neural networks and genetic algorithms are widely used for the sake of the above aim; however entro...
Fault Detection of Anti-friction Bearing using Ensemble Machine Learning Methods
The Anti-Friction Bearing (AFB) is a very important machine component, and its unscheduled failure leads to malfunction in a wide range of rotating machinery, which results in unexpected downtime and economic loss. In this paper, ensemble machine learning techniques are demonstrated for the detection of different AFB faults. Initially, statistical features were extracted from temporal vibratio...
Construction of Specific Machine Learning Paradigms from a Primitive-Based Generic Machine Learning Model
This paper is about the construction of various specific machine learning paradigms using a primitive-based generic machine learning model. The generic model identifies five functional components involved in a machine learning process, including an input, a transformation, a control, an output, and a knowledge base. It also identifies a set of basic machine learning mechanisms for each componen...
Machine learning algorithms in air quality modeling
Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. The recent advances in statistical modeling based on machine learning approaches have emerged as a solution to tackle these issues. It is a fact that input variable type largely affec...
Journal title:
- Eng. Appl. of AI
Volume 20
Publication date 2007